Building a reordering system using tree-to-string hierarchical model

Authors

  • Jacob Dlougach
  • Irina Galinskaya
Abstract

This paper describes our submission to the First Workshop on Reordering for Statistical Machine Translation. We decided to build a reordering system based on a tree-to-string model, using only publicly available tools. With the provided training data we built a translation model using the Moses toolkit, and then applied the chart decoder implemented in Moses to reorder the sentences. Although our submission covered only the English-Farsi language pair, we believe that the approach should work regardless of the choice of languages, so we also carried out experiments for English-Italian and English-Urdu. For these language pairs we observed a significant improvement over the baseline in the BLEU, Kendall-Tau and Hamming metrics. A detailed description is given so that others can reproduce our results, and some possible directions for further improvement are discussed.
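
To make the two permutation-based metrics named in the abstract more concrete, here is a minimal Python sketch. The abstract does not give the exact definitions used in the workshop evaluation, so the normalizations below (fraction of matching positions, fraction of concordant pairs) are an assumption, and the function names `hamming_score` and `kendall_tau_score` are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def hamming_score(hyp: Sequence[int], ref: Sequence[int]) -> float:
    """Fraction of positions at which hypothesis and reference agree.

    Assumed normalization: 1.0 means identical orderings.
    """
    if len(hyp) != len(ref):
        raise ValueError("permutations must have the same length")
    if not ref:
        return 1.0
    matches = sum(1 for h, r in zip(hyp, ref) if h == r)
    return matches / len(ref)


def kendall_tau_score(hyp: Sequence[int], ref: Sequence[int]) -> float:
    """Fraction of item pairs placed in the same relative order.

    Assumed normalization: 1.0 for identical orderings, 0.0 for a
    fully reversed ordering.
    """
    if sorted(hyp) != sorted(ref):
        raise ValueError("hyp and ref must be permutations of the same items")
    n = len(ref)
    if n < 2:
        return 1.0
    # Position of every item in the hypothesis ordering.
    pos = {item: i for i, item in enumerate(hyp)}
    # combinations(ref, 2) yields pairs (a, b) with a before b in ref.
    concordant = sum(1 for a, b in combinations(ref, 2) if pos[a] < pos[b])
    return concordant / (n * (n - 1) // 2)


if __name__ == "__main__":
    reference = [0, 1, 2, 3]
    hypothesis = [1, 0, 2, 3]  # the first two words are swapped
    print(hamming_score(hypothesis, reference))
    print(kendall_tau_score(hypothesis, reference))
```

Here each sentence is represented as a permutation of source-word indices. The Hamming variant rewards exact positions, while the Kendall-Tau variant rewards correct relative order, so a single local swap costs less under the latter.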

Similar resources

A Discriminative Syntactic Model for Source Permutation via Tree Transduction

A major challenge in statistical machine translation is mitigating the word order differences between source and target strings. While reordering and lexical translation choices are often conducted in tandem, source string permutation prior to translation is attractive for studying reordering using hierarchical and syntactic structure. This work contributes an approach for learning source strin...

Constituent Reordering and Syntax Models for English-to-Japanese Statistical Machine Translation

We present a constituent parsing-based reordering technique that improves the performance of the state-of-the-art English-to-Japanese phrase translation system that includes distortion models by 4.76 BLEU points. The phrase translation model with reordering applied at the pre-processing stage outperforms a syntax-based translation system that incorporates a phrase translation model, a hierarchi...

Dependency Graph-to-String Translation

Compared to tree grammars, graph grammars have stronger generative capacity over structures. Based on an edge replacement grammar, in this paper we propose to use a synchronous graph-to-string grammar for statistical machine translation. The graph we use is directly converted from a dependency tree by labelling edges. We build our translation model in the log-linear framework with standard feat...

DIAGNOSIS OF BREAST LESIONS USING THE LOCAL CHAN-VESE MODEL, HIERARCHICAL FUZZY PARTITIONING AND FUZZY DECISION TREE INDUCTION

Breast cancer is one of the leading causes of death among women. Mammography remains today the best technology to detect breast cancer, early and efficiently, to distinguish between benign and malignant diseases. Several techniques in image processing and analysis have been developed to address this problem. In this paper, we propose a new solution to the problem of computer aided detection and...

Deep Syntax Language Models and Statistical Machine Translation

Hierarchical Models increase the reordering capabilities of MT systems by introducing non-terminal symbols to phrases that map source language (SL) words/phrases to the correct position in the target language (TL) translation. Building translations via discontiguous TL phrases increases the difficulty of language modeling, however, introducing the need for heuristic techniques such as cube prun...

Journal:
  • CoRR

Volume: abs/1302.3057

Publication date: 2012